PARQ quantizer support for torchao's weight-only configs #2091

lisjin · 2025-04-21T22:37:42Z

This is the first step in supporting torchao.quantize_ for PARQ trained models. I target only Int4WeightOnlyConfig and IntxWeightOnlyConfig for now since PARQ does not have activation quantization.

Instead of converting the state (e.g., scale, zero point) from PARQ's existing quantizers to torchao format, I decided to create a new quantizer UnifTorchaoQuantizer. This quantizer calls torchao's quantization primitives choose_qparams_affine, quantize_affine, dequantize_affine to ensure parity between the two QAT methods.
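For readers unfamiliar with the three primitives named above, here is a minimal pure-Python sketch of the choose-qparams, quantize, dequantize round trip for symmetric, per-group weight quantization. It does not call torchao and is not its implementation: the helper names (`quant_range`, `choose_scale`, `quantize`, `dequantize`, `fake_quantize`) and the particular symmetric scale formula are assumptions chosen for illustration.

```python
from typing import List, Tuple


def quant_range(bits: int) -> Tuple[int, int]:
    """Signed integer range for a bit width, e.g. (-8, 7) for 4 bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def choose_scale(group: List[float], bits: int, eps: float = 1e-12) -> float:
    """Assumed symmetric rule: spread [-max_abs, max_abs] over the integer range."""
    quant_min, quant_max = quant_range(bits)
    max_abs = max(abs(w) for w in group)
    # eps guards against a zero scale when the whole group is zero.
    return max(2.0 * max_abs / (quant_max - quant_min), eps)


def quantize(group: List[float], scale: float, bits: int) -> List[int]:
    """Round to the nearest integer level (zero point 0) and clamp to range."""
    quant_min, quant_max = quant_range(bits)
    return [min(max(round(w / scale), quant_min), quant_max) for w in group]


def dequantize(levels: List[int], scale: float) -> List[float]:
    """Map integer levels back to floating point."""
    return [q * scale for q in levels]


def fake_quantize(weights: List[float], bits: int, group_size: int) -> List[float]:
    """Quantize and dequantize each group of `group_size` weights independently."""
    out: List[float] = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        scale = choose_scale(group, bits)
        out.extend(dequantize(quantize(group, scale, bits), scale))
    return out


weights = [0.5, -1.0, 0.25, 0.75, 0.1, -0.2, 0.3, -0.4]
restored = fake_quantize(weights, bits=4, group_size=4)
```

Under these assumptions each restored weight differs from its original by at most half a scale step. Parity between a training-time quantizer and a post-training one then comes down to both sides computing the same scale and the same rounding.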

@metascroy It would be great if you could check the correctness of how the quantizer in TestUnifTorchaoQuantizer.test_intx_weight_only is initialized. I'm not sure if I missed any subtleties with int8.

pytorch-bot · 2025-04-21T22:37:46Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2091

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 073e1fa with merge base e3db2b2:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

metascroy · 2025-04-22T22:21:54Z

@lisjin can you give a little code snippet of how our QAT prepare/convert would work for this API?

I'm having trouble following. Here are some example code snippets from other APIs: https://fb.workplace.com/groups/pytorch.edge2.team/permalink/1186139489308568/

andrewor14 · 2025-04-24T21:30:26Z

Hi @lisjin, do you mind adding a code snippet to the main README showing what the end-to-end flow would look like? My understanding is you can just replace the LSBQuantizer there with your new UnifTorchaoQuantizer. Then what happens after training? Do we call quantize_(model, Int4WeightOnlyConfig) as before? Would be good to clarify.

lisjin · 2025-04-25T17:48:51Z

@andrewor14 Thanks for the feedback—I removed config from UnifTorchaoQuantizer. In the README, I've also added a side-by-side comparison of PARQ vs. torchao prepare and convert steps. After PARQ training, we call optimizer.torchao_quantize_(model, config). Let me know if there's anything missing.

andrewor14 · 2025-04-25T18:34:06Z

Looks great, thanks @lisjin! The README is very clear.

One thing I want to discuss is whether we can just use a new PARQConfig instead, so the PARQ flow looks more like the existing torchao QAT flow. This is the current convert flow in the PR:

config = IntxWeightOnlyConfig(weight_dtype=torch.int4, granularity=PerGroup(32))
optimizer.torchao_quantize_(model, config)

What do you think about something like this instead?

inner_config = IntxWeightOnlyConfig(weight_dtype=torch.int4, granularity=PerGroup(32))
parq_config = PARQConfig(optimizer, inner_config)
quantize_(model, parq_config)
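To make the proposal concrete, here is a toy sketch of how a wrapper config could route a `quantize_` call back to the optimizer's `torchao_quantize_` method. Everything here is a hypothetical stand-in written for illustration: `InnerConfig`, `ToyOptimizer`, the handler registry, and this `quantize_` are not torchao's classes or its dispatch mechanism, and `PARQConfig` is only the name proposed in this thread.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class InnerConfig:
    """Hypothetical stand-in for a weight-only config."""
    bits: int
    group_size: int


class ToyOptimizer:
    """Hypothetical stand-in for PARQ's quantization-aware optimizer."""

    def __init__(self) -> None:
        self.calls: List[Tuple[Any, InnerConfig]] = []

    def torchao_quantize_(self, model: Any, config: InnerConfig) -> None:
        # A real optimizer would convert the trained weights in place;
        # this toy version only records that it was asked to.
        self.calls.append((model, config))


@dataclass
class PARQConfig:
    """Wrapper pairing the optimizer with the config to apply (proposed name)."""
    optimizer: ToyOptimizer
    inner_config: InnerConfig


# Assumed registry mapping a config type to the function that applies it.
_HANDLERS: Dict[type, Callable[[Any, Any], None]] = {}


def register_handler(config_type: type) -> Callable:
    def decorator(fn: Callable[[Any, Any], None]) -> Callable[[Any, Any], None]:
        _HANDLERS[config_type] = fn
        return fn
    return decorator


@register_handler(PARQConfig)
def _apply_parq(model: Any, config: PARQConfig) -> None:
    # Delegate to the optimizer, which owns the per-parameter quantizer state.
    config.optimizer.torchao_quantize_(model, config.inner_config)


def quantize_(model: Any, config: Any) -> None:
    """Toy dispatcher: look up the handler for this config type and run it."""
    handler = _HANDLERS.get(type(config))
    if handler is None:
        raise TypeError(f"no handler registered for {type(config).__name__}")
    handler(model, config)


optimizer = ToyOptimizer()
inner_config = InnerConfig(bits=4, group_size=32)
quantize_("model", PARQConfig(optimizer, inner_config))
```

The trade-off in this sketch is that the caller sees a single `quantize_` entry point, while the config object must carry a reference to the optimizer because the quantizer state lives there.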

Also curious if @metascroy has any thoughts on this

andrewor14

Looks good to me other than the recursion comment. @metascroy any other thoughts?

metascroy · 2025-04-26T01:23:56Z

Looks good to me! Thanks @lisjin!

Can we add an end-to-end test_intx_weight_only_e2e for intx (with various x-values), similar to test_int4_weight_only_e2e?

lisjin · 2025-04-28T14:14:24Z

test/prototype/test_parq.py

+    @common_utils.parametrize("b", [2, 3, 4, 8])
+    def test_intx_weight_only_e2e(self, b: int = 2, group_size: int = 32):


@metascroy Thanks for looking it over! I've added this end-to-end test, along with mapping_type=MappingType.SYMMETRIC and target_dtype=torch.int8 defaults for UnifTorchaoQuantizer.

facebook-github-bot · 2025-04-28T15:40:26Z

@lisjin has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

lisjin requested review from andrewor14, jerryzh168 and metascroy April 21, 2025 22:37

facebook-github-bot added the CLA Signed label Apr 21, 2025

lisjin added the topic: improvement label Apr 21, 2025

lisjin force-pushed the parq branch 4 times, most recently from 48520cb to bea111c April 22, 2025 15:22

metascroy reviewed Apr 22, 2025

test/prototype/test_parq.py (resolved)

metascroy reviewed Apr 22, 2025

test/prototype/test_parq.py (outdated, resolved)

metascroy reviewed Apr 22, 2025

test/prototype/test_parq.py (outdated, resolved)

lisjin added 5 commits April 23, 2025 14:16

Add parq.quant.UnifTorchaoQuantizer for quantize_ API equivalence

e3532e4

Test IntxWeightOnlyConfig

bc0e52a

Formatting fix

df8867f

Per-row IntxWeightOnlyConfig test

1d3d7d9

Add end-to-end QAT prepare/convert test case

b5d83bb

lisjin force-pushed the parq branch from bea111c to b5d83bb April 23, 2025 21:17

lisjin commented Apr 23, 2025

test/prototype/test_parq.py (resolved)

lisjin commented Apr 23, 2025

test/prototype/test_parq.py (outdated, resolved)

lisjin requested a review from metascroy April 23, 2025 21:27

Merge remote-tracking branch 'pytorch/main' into parq

5ac8a9b

lisjin force-pushed the parq branch 2 times, most recently from f6362f7 to fb7f521 April 24, 2025 02:08

Pass explicit layout to int4_weight_only

d7710cf

lisjin force-pushed the parq branch from fb7f521 to d7710cf April 24, 2025 04:41

lisjin commented Apr 24, 2025

torchao/prototype/parq/optim/quantopt.py (outdated, resolved)

Add QuantOptimizer.torchao_quantize_

6130cc2

lisjin force-pushed the parq branch from 81ba287 to 6130cc2 April 24, 2025 16:36

andrewor14 reviewed Apr 24, 2025

torchao/prototype/parq/quant/uniform_torchao.py (outdated, resolved)

Merge remote-tracking branch 'pytorch/main' into parq

667dfe7

lisjin requested a review from andrewor14 April 25, 2025 17:48

lisjin force-pushed the parq branch from d7e380f to 25814b9 April 25, 2025 18:11

Update README, add Int4UnifTorchaoQuantizer

9e70d6d

lisjin force-pushed the parq branch from 25814b9 to 9e70d6d April 25, 2025 18:14

lisjin commented Apr 25, 2025

torchao/prototype/parq/optim/quantopt.py (outdated, resolved)

andrewor14 reviewed Apr 25, 2025

torchao/prototype/parq/optim/quantopt.py (outdated, resolved)

andrewor14 approved these changes Apr 25, 2025

metascroy reviewed Apr 26, 2025

torchao/prototype/parq/quant/uniform_torchao.py (outdated, resolved)

metascroy reviewed Apr 26, 2025

torchao/prototype/parq/quant/uniform_torchao.py (outdated, resolved)

lisjin added 2 commits April 28, 2025 05:48

Merge remote-tracking branch 'pytorch/main' into parq

490bdf6

Add test_intx_weight_only_e2e, set UnifTorchaoQuantizer defaults

3809add

lisjin force-pushed the parq branch from 6c1b813 to 3809add April 28, 2025 14:07

lisjin commented Apr 28, 2025

test/prototype/test_parq.py (outdated, resolved)

lisjin commented Apr 28, 2025

lisjin requested review from andrewor14 and metascroy April 28, 2025 14:14

andrewor14 reviewed Apr 28, 2025

torchao/prototype/parq/README.md (outdated, resolved)

Update PARQ README

073e1fa

lisjin force-pushed the parq branch from 313c015 to 073e1fa April 28, 2025 14:32

lisjin merged commit 8334340 into pytorch:main Apr 28, 2025
19 of 20 checks passed

lisjin deleted the parq branch April 28, 2025 17:36
